251 research outputs found

    High-Performance Distributed ML at Scale through Parameter Server Consistency Models

    As Machine Learning (ML) applications increase in data size and model complexity, practitioners turn to distributed clusters to satisfy the increased computational and memory demands. Unfortunately, effective use of clusters for ML requires considerable expertise in writing distributed code, while highly-abstracted frameworks like Hadoop have not, in practice, approached the performance seen in specialized ML implementations. The recent Parameter Server (PS) paradigm is a middle ground between these extremes, allowing easy conversion of single-machine parallel ML applications into distributed ones, while maintaining high throughput through relaxed "consistency models" that allow inconsistent parameter reads. However, due to insufficient theoretical study, it is not clear which of these consistency models can really ensure correct ML algorithm output; at the same time, there remain many theoretically-motivated but undiscovered opportunities to maximize computational throughput. Motivated by this challenge, we study both the theoretical guarantees and empirical behavior of iterative-convergent ML algorithms in existing PS consistency models. We then use the gleaned insights to improve a consistency model using an "eager" PS communication mechanism, and implement it as a new PS system that enables ML algorithms to reach their solution more quickly. Comment: 19 pages, 2 figures

    Modeling and Detecting Network Communities with the Fusion of Node Attributes

    As a fundamental structure in real-world networks, communities can be reflected by abundant node attributes with the fusion of graph topology. In attribute-aware community detection, probabilistic generative models (PGMs) have become the mainstream fusion method due to their principled characterization and interpretation. Here, we propose a novel PGM without imposing any distributional assumptions on attributes, which is superior to existing PGMs that require attributes to be categorical or Gaussian distributed. Based on the famous block model of graph structure, our model fuses the attribute by describing its effect on node popularity using an additional term. To characterize the effect quantitatively, we analyze the detectability of communities for the proposed model and then establish the requirements of the attribute-popularity term, which leads to a new scheme for the model selection problem in attribute-aware community detection. With the model determined, an efficient algorithm is developed to estimate the parameters and to infer the communities. The proposed method is validated from two aspects. First, the effectiveness of our algorithm is theoretically guaranteed by the detectability condition, whose correctness is verified by numerical experiments on artificial graphs. Second, extensive experiments show that our method outperforms the competing approaches on a variety of real-world networks. Comment: other authors do not want to preprint

    Petuum: A New Platform for Distributed Machine Learning on Big Data

    What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization strategies employ fine-grained operations and scheduling beyond the classic bulk-synchronous processing paradigm popularized by MapReduce, or even specialized graph-based execution that relies on graph representations of ML programs. The variety of approaches tends to pull systems and algorithms design in different directions, and it remains difficult to find a universal platform applicable to a wide range of ML programs at scale. We propose a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale ML, by observing that many ML programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions. This presents unique opportunities for an integrative system design, such as bounded-error network synchronization and dynamic scheduling based on ML program structure. We demonstrate the efficacy of these system designs versus well-known implementations of modern ML algorithms, allowing ML programs to run in much less time and at considerably larger model sizes, even on modestly-sized compute clusters. Comment: 15 pages, 10 figures, final version in KDD 2015 under the same title

    LTNE approach and simulation for anode-supported SOFCs

    Fuel cells are promising for future energy systems, since they are energy efficient and, when hydrogen is used as fuel, emit no greenhouse gases. Fuel cells have seen various improvements in recent years; however, the technology is still in the early phases of development, as shown by the lack of a dominant design for single fuel cells, stacks, and entire fuel cell systems. In this study a CFD approach (COMSOL Multiphysics) is employed to investigate the effect of inlet temperature, oxygen surplus, ionic conductivity and current density on the temperature distribution in an anode-supported intermediate temperature solid oxide fuel cell (IT-SOFC). The developed model is based on the governing equations of heat, mass and momentum transport. A local temperature non-equilibrium (LTNE) approach is introduced to calculate the temperature distribution in the gas and solid phases separately. The results show that the temperature increase along the flow direction is controlled by the degree of surplus air. It is also found that the ohmic polarization in the electrolyte and the activation polarization in the anode and cathode have a major influence on performance. If a counter-flow approach is employed, the inlet temperature of the fuel stream should be close to the outlet temperature of the air flow to avoid an excessively high temperature gradient.

    Reinforcement learning based anti-jamming schedule in cyber-physical systems

    In this paper, the security of cyber-physical systems is investigated, where observation data is transmitted from a sensor to an estimator through wireless channels disturbed by an attacker. Data transmission fails when the sensor accesses a channel that happens to be attacked by the jammer. Since the system performance, measured by the estimation error, depends on whether the data transmission succeeds, the problem of selecting the channel to alleviate the effect of the attack is studied. Moreover, the state of each channel is time-variant due to various factors, such as path loss and shadowing. Motivated by energy conservation, the problem of selecting the channel with the best state is also considered. With the help of cognitive radio techniques, the sensor is able to select a sequence of channels dynamically. Based on this, the channel selection problem is solved by means of reinforcement learning, so as to jointly avoid the attack and use the channel with the best state. A corresponding algorithm is presented to obtain the sequence of channels for the sensor, and its effectiveness is proved analytically. Numerical simulations further verify the derived results.

    Hairpin DNA functionalized gold nanorods for mRNA detection in homogenous solution

    We report a novel fluorescent probe for mRNA detection. It consists of a gold nanorod (GNR) functionalized with fluorophore-labeled hairpin oligonucleotides (hpDNA) that are complementary to the mRNA of a target gene. This nanoprobe was found to be sensitive to a complementary oligonucleotide, as indicated by significant changes in both fluorescence intensity and lifetime. The influence of the surface density of hpDNA on the performance of this nanoprobe was investigated, suggesting that high hybridization efficiency could be achieved at a relatively low surface loading density of hpDNA. However, steady-state fluorescence spectroscopy revealed better overall performance, in terms of sensitivity and detection range, for nanoprobes with higher hairpin coverage. Time-resolved fluorescence lifetime spectroscopy revealed significant lifetime changes of the fluorophore upon hybridization of hpDNA with targets, providing further insight into the hybridization kinetics of the probe as well as the quenching efficiency of GNRs.